On the Computability of Infinite-Horizon Partially Observable Markov Decision Processes

Author

  • Omid Madani
Abstract

We investigate the computability of infinite-horizon partially observable Markov decision processes under discounted and undiscounted optimality criteria. The undecidability of the emptiness problem for probabilistic finite automata is used to show that a few technical problems, such as the isolation of a threshold, and closely related undiscounted problems, such as probabilistic planning, are undecidable. The decidability of the corresponding problems under the discounted criterion remains largely open, but we provide evidence for the decidability of several of them, and we also give evidence of hardness, as there may be no closed forms for describing optimal sequences of actions. The research sheds light on some interesting structural properties of these problems.

We investigate the computability of infinite-horizon partially observable Markov decision processes (POMDPs). These problems form the basic model for closely related problems in the area of probabilistic planning. Their computability had been questioned or conjectured before (see, for example, [PT87] and [Lit96]). To simplify and focus on important properties of the problems, we will concentrate on unobservable MDPs, or UMDPs. Of course, any hardness result shown applies to the more general class of POMDPs as well. In Section 1, we give a brief introduction to the models, several infinite-horizon discounted and undiscounted optimality criteria, notions of optimal policies, values and action sequences, and the computational problems of interest. In Section 2, we describe the emptiness problem for probabilistic finite automata (PFAs) and its significance for computational problems of UMDPs under undiscounted optimality criteria. Surprisingly, the emptiness problem for PFAs is undecidable [CL89, Paz71].
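
As a rough illustration of the emptiness problem (a sketch that is not taken from the paper), the Python snippet below computes the acceptance probability of a string under a PFA and searches strings up to a bounded length for one whose acceptance probability exceeds a threshold. It assumes the standard formulation in which a PFA consists of an initial distribution, one row-stochastic matrix per input symbol, and a set of accepting states. The two-state automaton and the names `acceptance_probability`, `find_word_above`, `INITIAL`, `TRANSITIONS`, and `ACCEPTING` are hypothetical and chosen only for this example.

```python
from fractions import Fraction
from itertools import product


def acceptance_probability(initial, transitions, accepting, word):
    """Probability that the PFA ends in an accepting state after reading `word`."""
    dist = list(initial)
    n = len(dist)
    for symbol in word:
        matrix = transitions[symbol]
        # One step of the chain: row vector times row-stochastic matrix.
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return sum(dist[s] for s in accepting)


def find_word_above(initial, transitions, accepting, threshold, max_len):
    """Bounded search for a word accepted with probability above `threshold`.

    Returning None only means no word of length <= max_len qualifies;
    it says nothing about longer words.
    """
    alphabet = sorted(transitions)
    for length in range(max_len + 1):
        for word in product(alphabet, repeat=length):
            if acceptance_probability(initial, transitions, accepting, word) > threshold:
                return word
    return None


# Hypothetical toy PFA: state 0 is the start state, state 1 is accepting.
half = Fraction(1, 2)
INITIAL = [Fraction(1), Fraction(0)]
TRANSITIONS = {
    "a": [[half, half], [Fraction(0), Fraction(1)]],  # 'a' moves 0 -> 1 with prob. 1/2
    "b": [[Fraction(1), Fraction(0)], [Fraction(1), Fraction(0)]],  # 'b' resets to 0
}
ACCEPTING = {1}

if __name__ == "__main__":
    print(acceptance_probability(INITIAL, TRANSITIONS, ACCEPTING, "aa"))
    print(find_word_above(INITIAL, TRANSITIONS, ACCEPTING, Fraction(4, 5), 5))
```

A bounded search of this kind can confirm that a qualifying string exists, but a negative answer for one length bound settles nothing about longer strings, which is consistent with the problem being undecidable in general.
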
We explain in some detail how the undecidability result in [CL89] is established, and explore consequences of the result, including the undecidability of a few related technical problems, such as the isolation of a threshold, as well as the undecidability of probabilistic planning in its general form. Section 3 concerns the UMDP model in the presence of discounting, which adds an interesting twist to the problems. One view of discounting is that, in its presence, the dynamics of the model terminate with probability one. Hence, while the horizon is still infinite, these models lie semantically in between finite-horizon models and undiscounted infinite-horizon models. The decision problems also appear to be easier computationally than the corresponding ones in the

Similar resources

On the Undecidability of Probabilistic Planning and Infinite-Horizon Partially Observable Markov Decision Problems

We investigate the computability of problems in probabilistic planning and partially observable infinite-horizon Markov decision processes. The undecidability of the string-existence problem for probabilistic finite automata is adapted to show that the following problem of plan existence in probabilistic planning is undecidable: given a probabilistic planning problem, determine whether there ex...

Optimal Control of Infinite Horizon Partially Observable Decision Processes Modeled As Generators of Probabilistic Regular Languages

Decision processes with incomplete state feedback have been traditionally modeled as Partially Observable Markov Decision Processes. In this paper, we present an alternative formulation based on probabilistic regular languages. The proposed approach generalizes the recently reported work on language measure theoretic optimal control for perfectly observable situations and shows that such a fram...

Optimal control of infinite horizon partially observable decision processes modelled as generators of probabilistic regular languages

Decision processes with incomplete state feedback have been traditionally modelled as partially observable Markov decision processes. In this article, we present an alternative formulation based on probabilistic regular languages. The proposed approach generalises the recently reported work on language measure theoretic optimal control for perfectly observable situations and shows that such a f...

Publication date: 1998